added: NonLinMPC and MovingHorizonEstimator integration with DI.jl #174

Merged: 33 commits into main on Mar 20, 2025

Conversation

@franckgaga (Member) commented Mar 12, 2025

Nonlinear MPC and moving horizon estimator now use DifferentiationInterface.jl

Both the NonLinMPC and MovingHorizonEstimator constructors now accept gradient and jacobian keyword arguments, which receive an AbstractADType to switch the backend of the objective gradient and of the constraint Jacobians, respectively. Note that the nonlinear inequality and equality constraints both use the backend provided by the jacobian argument. The possibility to provide a hessian will be added in a future PR (the LBFGS approximation is used inside Ipopt for now, as was the case before).
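
A minimal sketch of the new keyword arguments (the model, the horizon values and the AutoFiniteDiff choice below are illustrative placeholders, not the defaults):

```julia
using ModelPredictiveControl, DifferentiationInterface
# import FiniteDiff  # needed at differentiation time when actually using AutoFiniteDiff()

# `model` is assumed to be an existing SimModel (e.g. a NonLinModel built elsewhere).
mpc = NonLinMPC(model; Hp=20, Hc=2,
    gradient = AutoFiniteDiff(),   # backend for the objective gradient
    jacobian = AutoFiniteDiff(),   # backend for the equality and inequality constraint Jacobians
)
mhe = MovingHorizonEstimator(model; He=10,
    gradient = AutoFiniteDiff(),
    jacobian = AutoFiniteDiff(),
)
```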

By default, dense AutoForwardDiff() is used everywhere, except for NonLinMPC with a transcription method other than SingleShooting, in which case this backend is used:

using DifferentiationInterface               # re-exports AutoSparse and AutoForwardDiff from ADTypes.jl
using SparseConnectivityTracer: TracerSparsityDetector
using SparseMatrixColorings: GreedyColoringAlgorithm

AutoSparse(
    AutoForwardDiff();
    sparsity_detector  = TracerSparsityDetector(),
    coloring_algorithm = GreedyColoringAlgorithm()
)

For both objects, the differentiation makes use of the DifferentiationInterface.Constant() and DifferentiationInterface.Cache() functionalities. As a consequence, they no longer rely on PreallocationTools.DiffCache. However, I still need to keep this dependency, since linearize! and all DiffSolvers still rely on it for now.
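
A minimal sketch of that pattern (the objective J, the matrix A and the work buffer are hypothetical stand-ins for the real MPC/MHE internals):

```julia
using DifferentiationInterface, LinearAlgebra
import ForwardDiff

# Hypothetical objective: `A` is a constant parameter, `work` is a preallocated buffer.
J(x, A, work) = sum(abs2, mul!(work, A, x))

x, A, work = rand(3), rand(3, 3), zeros(3)
backend = AutoForwardDiff()

prep = prepare_gradient(J, backend, x, Constant(A), Cache(work))
g    = similar(x)
gradient!(J, g, prep, backend, x, Constant(A), Cache(work))  # no PreallocationTools.DiffCache needed
```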

Documentation now uses DocumenterInterLinks.jl

Some URLs in the docstrings were very long. They are now way shorter. This also eases the maintenance of documentation links to external dependencies.

The doc preview is here: https://juliacontrol.github.io/ModelPredictiveControl.jl/previews/PR174

@codecov-commenter commented Mar 18, 2025

Codecov Report

Attention: Patch coverage is 99.32432% with 1 line in your changes missing coverage. Please review.

Project coverage is 98.87%. Comparing base (c12d7bc) to head (990f9c5).

Files with missing lines      Patch %   Lines
src/controller/execute.jl     85.71%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #174      +/-   ##
==========================================
+ Coverage   98.85%   98.87%   +0.01%     
==========================================
  Files          25       25              
  Lines        4180     4160      -20     
==========================================
- Hits         4132     4113      -19     
+ Misses         48       47       -1     


@gdalle commented Mar 18, 2025

Heads up, DI v0.6.46 supports nested tuples and named tuples of arrays as caches

@franckgaga (Member Author)

Thanks for the info, right now I do:

mycontext = (
    Constant(a),
    Cache(b),
    Cache(c),
    Cache(d),
)
grad_prep = prepare_gradient(myfunc, grad_backend, x, mycontext...; strict=Val(true))

and a similar splatting for the context in gradient!. Do you think this splatting is problematic for performance? I did some quick benchmarks and measured no impact at all.

@gdalle commented Mar 18, 2025

I don't think it is problematic, but for some users it is a bit painful: when you have tens of caches, you may want to structure them more neatly, so I give that option too.
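
For example, a single Cache can wrap a named tuple of buffers (a minimal sketch assuming DI ≥ v0.6.46; the function and buffer names are made up):

```julia
using DifferentiationInterface
import ForwardDiff

# One Cache wrapping a named tuple of work arrays, instead of splatting several Cache(...)
function f(x, work)
    copyto!(work.b, x)       # first bundled buffer
    work.c .= 2 .* work.b    # second bundled buffer
    return sum(work.c)
end

x       = [1.0, 2.0, 3.0]
buffers = Cache((b = zeros(3), c = zeros(3)))

prep = prepare_gradient(f, AutoForwardDiff(), x, buffers)
gradient(f, prep, AutoForwardDiff(), x, buffers)  # == [2.0, 2.0, 2.0]
```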

@franckgaga (Member Author)

Alright, thanks! I think it's neat enough in the code like that.

This is "safer" like that. And it allows the user to verify what are the DI backends in a given MPC/MHE object. Also useful for the tests.
debug: temporarily fake a "filled" window for MHE gradient preparation (needed for some backend like FiniteDiff)
@franckgaga franckgaga requested a review from baggepinnen March 18, 2025 21:21
@franckgaga (Member Author) commented Mar 18, 2025

@baggepinnen It's ready to be merged, if you have any comments. You can have a look at the NonLinMPC and MovingHorizonEstimator documentation (additional details in their respective Extended Help sections).

edit: I'm running the benchmarks of our papers on this version, and there is no significant difference in performance, good news ❤️

@franckgaga franckgaga linked an issue Mar 19, 2025 that may be closed by this pull request
@gdalle commented Mar 19, 2025

edit: I'm running the benchmarks of our papers on this version, and there is no significant difference in performance, good news ❤️

If you want to play around some more, a parameter that might impact performance is the coloring algorithm used by SparseMatrixColorings, more specifically the vertex ordering. Feel free to switch the default NaturalOrder() and see if it makes a difference with SMC.ncolors(jacobian_prep). Roughly speaking, the sparse autodiff complexity scales with the number of colors $c$ (or with $\lceil c / 12 \rceil$ for ForwardDiff).
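
For instance, a minimal sketch of that experiment on a toy sparse Jacobian (the function, its size and the orderings are arbitrary; in the package the prep would be the constraint jacobian_prep instead):

```julia
using DifferentiationInterface                 # AutoSparse and AutoForwardDiff come from ADTypes.jl
import ForwardDiff
using SparseConnectivityTracer: TracerSparsityDetector
import SparseMatrixColorings as SMC

f(x) = diff(x .^ 2)   # toy function with a sparse (bidiagonal) Jacobian
x = rand(50)

for order in (SMC.NaturalOrder(), SMC.LargestFirst())
    backend = AutoSparse(
        AutoForwardDiff();
        sparsity_detector  = TracerSparsityDetector(),
        coloring_algorithm = SMC.GreedyColoringAlgorithm(order),
    )
    prep = prepare_jacobian(f, backend, x)
    @show order SMC.ncolors(prep)   # fewer colors ⇒ fewer compressed ForwardDiff sweeps
end
```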

@franckgaga (Member Author) commented Mar 19, 2025

If you want to play around some more, a parameter that might impact performance is the coloring algorithm used by SparseMatrixColorings, more specifically the vertex ordering. Feel free to switch the default NaturalOrder() and see if it makes a difference with SMC.ncolors(jacobian_prep). Roughly speaking, the sparse autodiff complexity scales with the number of colors $c$ (or with $\lceil c / 12 \rceil$ for ForwardDiff).

I have not played a lot with sparse Jacobians so far. The pendulum example in our paper is small in terms of number of decision variables, so it is expected that, no matter the chosen coloring/compression/etc. algorithm, it will always be a bit slower than a dense formulation (for the same AD backend). It's like classical matrix operations, e.g. products: if the matrices are relatively small, dense operations are generally faster even if they contain many zeros.

And that was indeed the case in my quick tests. That is, on the inverted pendulum model with a control horizon of $H_c=2$ and a prediction horizon of $H_p=20$, from fastest to slowest:

  1. Single shooting transcription (dense formulation of the problem) with dense ForwardDiff, about 12 ms per time step
  2. Multiple shooting transcription (sparse formulation of the problem) with dense ForwardDiff, about 20 ms per time step
  3. Multiple shooting transcription (sparse formulation of the problem) with sparse ForwardDiff, about 29 ms per time step

I may improve 3. a little bit by tweaking the coloring algorithm, but IMO this is not the right case study to adequately benchmark sparse computations.

@franckgaga (Member Author) commented Mar 19, 2025

@gdalle I just read that, for the Constant() context, some backends require a Number or an AbstractArray.

Right now I pass an mpc::NonLinMPC object in the Constant argument. I understand that it won't work with some backends? Do you know which ones?

@gdalle commented Mar 20, 2025

I think it will only be symbolic backends. I should update the docstring. Not a hard limitation either, I just didn't get around to it.

@franckgaga (Member Author)

Okay, I just tested it and yes, AutoSymbolics does not work with a non-AbstractArray in Constant. But it does not support Cache either, and I also use those extensively (with normal Vectors).

I can use a closure and capture mpc to avoid passing it as a Constant. There is a small impact on performance, though. But I don't have a workaround for my Cached vectors (except by manually reallocating at each call).

I will leave it as is for now.

@gdalle commented Mar 20, 2025

But it does not support Cache either, and I also use those extensively (with normal Vectors).

Did you find a bug there? I think the docs might be out of date, Caches with numbers or arrays should be supported by Symbolics too:

julia> using DifferentiationInterface

julia> import Symbolics

julia> f(x, c) = sum(copyto!(c, x))
f (generic function with 1 method)

julia> gradient(f, AutoSymbolics(), [1.0], Cache([0.0]))
1-element Vector{Int64}:
 1

@gdalle commented Mar 20, 2025

Okay, I just tested it and yes, AutoSymbolics does not work with a non-AbstractArray in Constant.

The reason why symbolic backends (and ReverseDiff) are more difficult for handling constants is that they generate an executable (or tape) for the function during preparation. If we close over the constants during preparation, they will be hardcoded into the executable. Therefore, we need to turn the constants into symbolic variables as well, so that the user is free to give different values at runtime.

I guess theoretically we could have a separation between "constant" Constants (those which are the same at preparation and execution) and "variable" Constants (those which differ between preparation and execution). But you can probably see how that would be confusing for the casual user, unless we come up with better names ^^

@gdalle commented Mar 20, 2025

But it does not support Cache either, and I also use those extensively (with normal Vectors).

Edit: I remember now, I marked Caches as non-supported by Symbolics because tests fail for the Jacobian only:

https://github.com/JuliaDiff/DifferentiationInterface.jl/blob/cfab84dd775f6b7e9b39d0d0f84f0d260fc4fe09/DifferentiationInterface/test/Back/SymbolicBackends/symbolics.jl#L22-L27

@franckgaga (Member Author) commented Mar 20, 2025

But it does not support Cache either, and I also use those extensively (with normal Vectors).

Did you find a bug there? I think the docs might be out of date, Caches with numbers or arrays should be supported by Symbolics too:

julia> using DifferentiationInterface

julia> import Symbolics

julia> f(x, c) = sum(copyto!(c, x))
f (generic function with 1 method)

julia> gradient(f, AutoSymbolics(), [1.0], Cache([0.0]))
1-element Vector{Int64}:
 1

Maybe. AutoSymbolics simply throws an error if I leave the Constant(mpc). If I remove the Constant(mpc) but leave the multiple Cache(vector_a), Cache(vector_b), ..., it returns a derivative, but it seems imprecise, since Ipopt suddenly returns error codes like NUMERICAL_ERROR and ALMOST_LOCALLY_SOLVED, stuff like that (or maybe it is precise at some points of the decision variable space and not others, hard to say). There are no error codes at all with AutoForwardDiff and AutoFiniteDiff.

Hard to troubleshoot this kind of problem. It feels like I just entered a rabbit hole haha.

@franckgaga (Member Author)

Okay, I just tested it and yes, AutoSymbolics does not work with a non-AbstractArray in Constant.

The reason why symbolic backends (and ReverseDiff) are more difficult for handling constants is that they generate an executable (or tape) for the function during preparation. If we close over the constants during preparation, they will be hardcoded into the executable. Therefore, we need to turn the constants into symbolic variables as well, so that the user is free to give different values at runtime.

I guess theoretically we could have a separation between "constant" Constants (those which are the same at preparation and execution) and "variable" Constants (those which differ between preparation and execution). But you can probably see how that would be confusing for the casual user, unless we come up with better names ^^

Yeah, this "nomenclature issue" crossed my mind during the implementation. And I literally use both "defintion" in the package right now: Constant(mpc) is literally a constant that won't change at runtime.

In the linearize implementation of #179, the Constants are vectors that are not mutated during the differentiation, but their values differ at each jacobian! call.

Worst idea ever: a new context called ConstantConstant() 😆

@franckgaga (Member Author)

Okay, I'm a bit ambivalent on whether I should keep the Constant(mpc) or not.

My feeling right now: since it does not need any modification at runtime (same object at each jacobian! call), it's a "constant" Constant and I can use the closure trick. Theoretically, if AutoSymbolics improves in the future, Cache with normal vectors will be well supported, and it will work well for gradient, jacobian, etc.

Conclusion: I will remove Constant(mpc) (and same thing for MHE).
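
For the record, a minimal sketch of the two alternatives (with a fake object and objective standing in for mpc and the real cost function):

```julia
using DifferentiationInterface
import ForwardDiff

struct FakeMPC; weight::Float64; end     # stand-in for mpc::NonLinMPC
J(x, mpc) = mpc.weight * sum(abs2, x)    # stand-in objective

mpc, x, backend = FakeMPC(2.0), rand(4), AutoForwardDiff()

# Option 1: pass the object through DI as a Constant context (rejected by e.g. AutoSymbolics)
prep1 = prepare_gradient(J, backend, x, Constant(mpc))
g1    = gradient(J, prep1, backend, x, Constant(mpc))

# Option 2: close over the object so DI never sees it (what this PR ends up doing)
J_closed = x -> J(x, mpc)
prep2 = prepare_gradient(J_closed, backend, x)
g2    = gradient(J_closed, prep2, backend, x)
```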

@gdalle commented Mar 20, 2025

My feeling right now: since it does not need any modification at runtime (same object at each jacobian! call), it's a "constant" Constant and I can use the closure trick.

I tend to agree with that. Essentially, if your constant never changes, it's part of your function, and then why should DI give a shit ^^

Theoretically, if AutoSymbolics improves in the future, Cache with normal vectors will be well supported, and it will work well for gradient, jacobian, etc.

Looking at it again, I think it is a bug in Symbolics. Will try to dig up an MWE

By using a closure on `mpc::NonLinMPC` and `estim::MovingHorizonEstimator`
@franckgaga (Member Author)

@baggepinnen I will merge the PR; you can comment here if you spot anything unclear or sus.

@franckgaga franckgaga merged commit bb68996 into main Mar 20, 2025
4 checks passed
Development

Successfully merging this pull request may close these issues.

Use DifferentiationInterface.jl